#### CS202: COMPUTER ORGANIZATION

#### Lecture 3

#### **Instruction Set Architecture**

#### **Outline**

- Instruction set architecture
  - RISC vs. CISC
  - MIPS/ARM/x86
- Instructions:
  - Arithmetic instruction: add, sub, ...
  - Data transfer instruction: lw, sw, lh, sh, ...
  - Logical instruction: and, or, ...
  - Conditional branch beq, bne, ...
- Basic concepts:
  - Operands: register vs. memory vs. immediate
  - Numeric representation: signed, unsigned, sign extension
  - Instruction format: R-format vs. I-format

#### Instruction Set

- To command a computer's hardware, you must speak its language
  - ✓ Instructions: words of a computer's language
  - ✓ Instruction set: vocabulary of commands
- •Two forms of instruction set:
  - ✓ Assembly language: written by people
  - ✓ Machine language: read by computer
- A program (in say, C) is compiled into an executable program that is composed of machine instructions
  - ✓ This executable program must also run on future machines.
  - ✓ Each Intel processor reads in the same x86 instructions, but each processor handles instructions differently
- Java programs are converted into portable bytecode that is converted into machine instructions during execution (just-in-time compilation)

#### Instruction Set

- Instruction set of different machine are similar, because
  - ✓ they base on similar design principles
  - ✓ several basic operations are provided
  - ✓ computer designers have a common goal
- Design target:
  - ✓ easy to build the hardware and compiler
  - ✓ maximizing performance and minimizing cost and energy
- Important design principles:
  - ✓ keep the hardware simple the chip must only implement basic primitives and run fast
  - ✓ keep the instructions regular simplifies the decoding/ scheduling of instructions

#### Von Neumann Architecture



It is easy to see by formal-logical methods that there exist certain [instruction sets] that are in abstract adequate to control and cause the execution of any sequence of operations.... The really decisive considerations from the present point of view, in selecting an [instruction set], are more of a practical nature: simplicity of the equipment demanded by the [instruction set], and the clarity of its application to the actually important problems together with the speed of its handling of those problems.

stored program computer

Burks, Goldstine, and von Neumann, 1947



#### Stored-program computer:

instructions and data of many types can be stored in memory as numbers

Von Neumann architecture

#### Harvard Architecture usually embedded (eg. DSP)

- Two separated memories:
  - √ data memory
  - ✓ instructions memory
- Two separated buses: data bus and memory bus
- More efficient than von Neumann, widely used in embedded systems



#### Instruction Set

- All instruction set are similar
  - ✓ Once learn one, easy to pick up others
- We will use MIPS as an example
  - ✓ MIPS: Microprocessor without interlocked pipeline stages
  - ✓ History of MIPS

#### RISC vs CISC

- ✓ RISC: reduced instruction set computer, e.g. MIPS, ARM, PowerPC, RISC-V
- CISC: complex instruction set computer, e.g. x86

C] SC also bring a lot of idea from CISC. 9.49744



John Cocke, IBM The father of RISC



John L. Hennessy Inventor of MIPS

#### A Basic MIPS Instruction

C code:

$$a = b + c$$
;

Assembly code: (human-friendly machine instructions) add a, b, c # a is the sum of b and c

Translate the following C code into assembly code: a = b + c + d + e; operand %

### Example

```
C code a = b + c + d + e;

translates into the following assembly code:

add \underline{a}, \underline{b}, \underline{c} add a, b, c

add \underline{a}, \underline{b}, \underline{c} add \underline{a}, \underline{b}, \underline{c}

add \underline{a}, \underline{a}, \underline{d} or add \underline{f}, \underline{d}, \underline{e}

add \underline{a}, \underline{a}, \underline{e} add \underline{a}, \underline{a}, \underline{f}
```

- Instructions are simple: fixed number of operands (unlike C)
- A single line of C code is converted into multiple lines of assembly code
- Some sequences are better than others... the second sequence needs one more (temporary) variable f

### Subtract Example

C code 
$$f = (g + h) - (i + j);$$

Assembly code translation with only add and sub instructions:

## Subtract Example

C code f = (g + h) - (i + j); translates into the following assembly code:

```
add t0, g, h add f, g, h add t1, i, j or sub f, f, i sub f, t0, t1 sub f, f, j
```

 Each version may produce a different result because floating-point operations are not necessarily associative and commutative... more on this later

## Design Principle 1

- Simplicity favors regularity
  - Regularity makes implementation simpler
  - ✓ Simplicity enables higher performance at lower cost.

汇编只支持有限个操作,更加simple

#### **Operands**

- In C, each "variable" is a location in memory
- In hardware, each memory access is expensive if variable a is accessed repeatedly, it helps to bring the variable into an on-chip scratchpad and operate on the scratchpad (registers)
- To simplify the instructions, we require that each instruction (add, sub) only operate on registers
- Note: the number of operands (variables) in a C program is very large; the number of operands in assembly is fixed... there can be only so limited number of registers

## Registers

- 一个指令8bit,表示一个reg 5bit, 翻金bit需要示指令。
- The MIPS ISA has 32 registers (x86 has 8 registers) –
   Why not more? Why not less?
- Each register is 32-bit wide (modern 64-bit architectures have 64-bit wide registers)
- A 32-bit entity (4 bytes) is referred to as a word
- To make the code more readable, registers are partitioned as \$s0-\$s7 (C/Java variables), \$t0-\$t9 (temporary variables)...

## **Memory Operands**

 Values must be fetched from memory before (add and sub) instructions can operate on them



How is memory-address determined?

### **Memory Address**

• The compiler organizes data in memory... it knows the location of every variable (saved in a table)... it can fill in the appropriate mem-address for load-store instructions





## **Immediate Operands**

- An instruction may require a constant as input
- An immediate instruction uses a constant number as one of the inputs (instead of a register operand)

```
$s0, $zero, 1000
                         # the program has base address
addi
                         # 1000 and this is saved in $s0
                         # $zero is a register that always
                         # equals zero
     $s1, $s0, 0
                        # this is the address of variable a
addi
addi $s2, $s0, 4
                        # this is the address of variable b
addi $s3, $s0, 8
                        # this is the address of variable c
      $s4, $s0, 12
addi
                        # this is the address of variable d[0]
```

### **Memory Instruction Format**

The format of a load instruction:



a constant that is added to the register in brackets

## Example

Convert to assembly:

C code: d[3] = d[2] + a;

## Example

Convert to assembly:

```
C code: d[3] = d[2] + a;
```

Assembly: # addi instructions as before

```
lw $t0, 8($s4) # d[2] is brought into $t0 lw $t1, 0($s1) # a is brought into $t1 add $t0, $t0, $t1 # the sum is in $t0 sw $t0, 12($s4) # $t0 is stored into d[3]
```

Assembly version of the code continues to expand!

## Registers vs. Memory

- Registers are faster to access than memory
- Operating on memory data requires loads and stores
  - More instructions to be executed
- Compiler must use registers for variables as much as possible
  - Only spill to memory for less frequently used variables
  - Register optimization is important!

## Design Principles

- Design Principle 2: Smaller is faster
  - Register vs. memory
  - Number of registers is small
- Design Principle 3: Make the common case fast
  - Small constants are common
  - Immediate operand avoids a load instruction

## **Numeric Representations**

 Assume that the bits in register \$s0 are as follows, what is the value of s0?



How about this one?

### **Numeric Representations**

- Decimal 35<sub>10</sub>
- Binary 00100011<sub>2</sub>
- Hexadecimal (compact representation)

0x 23 or  $23_{hex}$ 

0-15 (decimal)  $\rightarrow$  0-9, a-f (hex)

## **Unsigned Binary Integers**

Given an n-bit number

$$x = x_{n-1} 2^{n-1} + x_{n-2} 2^{n-2} + \dots + x_1 2^1 + x_0 2^0$$

- Range: 0 to +2<sup>n</sup>-1
- Example
  - $0000\ 0000\ 0000\ 0000\ 0000\ 0000\ 0001_2$  $= 0 + ... + 0 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0$

$$= 0 + ... + 0 + 0 + 2 + 1 = 3_{10}$$

- Using 32 bits
  - 0 to +4,294,967,295

## 2s-Complement Signed Integers

Given an n-bit number, define the value as follows:

$$x = -x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \dots + x_12^1 + x_02^0$$

- Range:  $-2^{n-1}$  to  $+2^{n-1}-1$
- Example
- Using 32 bits
  - -2,147,483,648 to +2,147,483,647

## Signed Negation

- 2's complement = 1's complement + 1
  - 1's complement means 1 → 0, 0 → 1
  - Using  $\overline{x}$  represent 1's complement

$$x + \overline{x} = 1111...111_2 = -1$$
  
 $\overline{x} + 1 = -x$ 

- Example: -2
  - +2 = 0000 0000 ... 0010<sub>2</sub>
  - $-2 = 1111 \ 1111 \dots \ 1101_2 + 1$ = 1111 \ 1111 \ \dots \ \ 1110\_2

## Sign Extension

- Representing a number using more bits
  - Preserve the numeric value
  - E.g. we copy 8-bit to register and then want to extend it to be 16-bit or 32-bit
- In MIPS instruction set
  - addi: extend immediate value
  - 1b, 1h: extend loaded byte/halfword
  - beq, bne: extend the displacement
- Replicate the sign bit to the left
  - c.f. unsigned values: extend with 0s
- Examples: 8-bit to 16-bit
  - +2: 0000 0010 => 0000 0000 0000 0010
  - –2: 1111 1110 => 1111 1111 1111 1110

#### **Instruction Formats**

Instructions are represented as 32-bit numbers (one word), broken into 6 fields

```
R-type instruction add $t0, $s1, $s2

000000 10001 10010 01000 00000 100000
6 bits 5 bits 5 bits 5 bits 6 bits
op rs rt rd shamt funct
opcode source source dest shift amt function
```

## Opcode and Registers

| CORE INSTRUCTI    | ON SE | T<br>FOR- |                            |       | OPCODE<br>/ FUNCT     |
|-------------------|-------|-----------|----------------------------|-------|-----------------------|
| NAME, MNEMO       | NIC   | MAT       | OPERATION (in Verilog)     |       | (Hex)                 |
| Add               | add   | R         | R[rd] = R[rs] + R[rt]      | (1)   | 0 / 20 <sub>hex</sub> |
| Add Immediate     | addi  | I         | R[rt] = R[rs] + SignExtImm | (1,2) | 8 <sub>hex</sub>      |
| Add Imm. Unsigned | addiu | I         | R[rt] = R[rs] + SignExtImm | (2)   | 9 <sub>hex</sub>      |
| Add Unsigned      | addu  | R         | R[rd] = R[rs] + R[rt]      |       | $0/21_{hex}$          |
| And               | and   | R         | R[rd] = R[rs] & R[rt]      |       | 0 / 24 <sub>hex</sub> |

| NAME      | NUMBER | USE                                                      | PRESERVEDACROSS<br>A CALL? |
|-----------|--------|----------------------------------------------------------|----------------------------|
| \$zero    | 0      | The Constant Value 0                                     | N.A.                       |
| \$at      | 1      | Assembler Temporary                                      | No                         |
| \$v0-\$v1 | 2-3    | Values for Function Results<br>and Expression Evaluation | No                         |
| \$a0-\$a3 | 4-7    | Arguments                                                | No                         |
| \$t0-\$t7 | 8-15   | Temporaries                                              | No                         |
| \$s0-\$s7 | 16-23  | Saved Temporaries                                        | Yes                        |

#### **Instruction Formats**

Instructions are represented as 32-bit numbers (one word), broken into 6 fields

| 1-type inst | truction | lw     | \$t0, 32(\$s1)  |
|-------------|----------|--------|-----------------|
| 100011      | 10001    | 01000  | 000000000100000 |
| 6 bits      | 5 bits   | 5 bits | 16 bits         |
| op          | rs       | rt     | constant        |
| opcode      | source   | dest   | constant        |

#### **Instruction Formats**

Instructions are represented as 32-bit numbers (one word), broken into 6 fields

| 1-type inst | truction | ado    | \$t0, \$s1, 32  |
|-------------|----------|--------|-----------------|
| 001000      | 10001    | 01000  | 000000000100000 |
| 6 bits      | 5 bits   | 5 bits | 16 bits         |
| ор          | rs       | rt     | constant        |
| opcode      | source   | dest   | constant        |

### Stored-Program Computer

- Today's computers are built on two key principles:
  - Instructions are represented as numbers.
  - Programs are stored in memory to be read or written, just like data.
- These principles lead to the stored-program concept.

## Design Principle 4

- Design Principle 4: Good design demands good compromises
  - Different formats complicate decoding, but allow 32-bit instructions uniformly
  - Keep formats as similar as possible

# **Logical Operations**

| Logical ops    | C operators | Java operators | MIPS instr |
|----------------|-------------|----------------|------------|
| Shift Left     | <<          | <<             | sll        |
| Shift Right    | >>          | >>>            | srl        |
| Bit-by-bit AND | &           | &              | and, andi  |
| Bit-by-bit OR  |             |                | or, ori    |
| Bit-by-bit NOT | ~           | ~              | nor        |

#### **Control Instructions**

- Conditional branch: Jump to instruction L1 if register1 equals register2: beq register1, register2, L1 Similarly, bne and slt (set-on-less-than)
- Unconditional branch:

```
j L1
jr $s0
```

#### Convert to assembly:

```
if (i == j)
    f = g+h;
else
    f = g-h;
```

#### Control Instructions: if else

- Conditional branch: Jump to instruction L1 if register1 equals to register2: beq register1, register2, L1 Similarly, bne and slt (set-on-less-than)
- Unconditional branch:

```
j L1
jr $s0
```

```
Convert to assembly: bne $s3, $s4, Else if (i == j) add $s0, $s1, $s2 f = g+h; j Exit else sub $s0, $s1, $s2 f = g-h; Exit:
```

### Loop

#### Convert to assembly:

```
while (save[i] == k)
i += 1;
```

i and k are in \$s3 and \$s5 and base of array save[] is in \$s6

```
Loop: sll $t1, $s3, 2
add $t1, $t1, $s6
lw $t0, 0($t1)
bne $t0, $s5, Exit
addi $s3, $s3, 1
j Loop
Exit:
```

| 7 | +  | ./2 | . وتر | .a   | -  |     | 2_ | a |    |    | . 1 |    |   |  |  |  |  |  |  |  |  |  |  |  |
|---|----|-----|-------|------|----|-----|----|---|----|----|-----|----|---|--|--|--|--|--|--|--|--|--|--|--|
| 个 | A. | 13  | X:    | טציו | De | りひ  | M  | 世 | CO | jt | + b | ne | - |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      | 5  | lon | ,  |   |    | fo | y t |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    | +  |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |
|   |    |     |       |      |    |     |    |   |    |    |     |    |   |  |  |  |  |  |  |  |  |  |  |  |